-
Notifications
You must be signed in to change notification settings - Fork 583
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
many: handlers and backend for kernel-modules components #13512
many: handlers and backend for kernel-modules components #13512
Conversation
The backend will create the necessary mount units for kernel-modules component type. These units need to be available early on boot so udevd and systemd-modules-load.service can access the contained modules and firmware. Additionally, depmod is invoked to recreate the modinfo files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused/worried about the consequences of the removing part of the setup. Maybe the description of the PR needs to explain a bit more what happens here?
overlord/snapstate/backend_test.go
Outdated
func (f *fakeSnappyBackend) SetupKernelModulesComponent(cpi snap.ContainerPlaceInfo, cref naming.ComponentRef, kernelVersion string, meter progress.Meter) (err error) { | ||
meter.Notify("setup-kernel-modules-component") | ||
f.appendOp(&fakeOp{ | ||
op: "setup-kernel-modules-component", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't we want to capture a bit more about the parameters? in retrospect we should have done so also for SetupComponent and UndoSetupComponent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I've fixed that now.
@@ -396,3 +398,185 @@ func (m *SnapManager) undoUnlinkCurrentComponent(t *state.Task, _ *tomb.Tomb) (e | |||
|
|||
return nil | |||
} | |||
|
|||
func kernelVersionFromPlaceInfo(spi snap.PlaceInfo) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this go in the kernel package?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved now
overlord/snapstate/component.go
Outdated
undoSetupKernMod := st.NewTask("cleanup-kernel-modules-component", | ||
fmt.Sprintf(i18n.G("Clean-up previous kernel-modules component"), | ||
compSi.Component, revisionStr)) | ||
addTask(undoSetupKernMod) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens if we boot in between here? shouldn't we do a unit replace/reload instead of removing/cleaning up stuff?
Or use symlinks if it's simpler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It a reboot happens between the clean-up of the old component and the creation of the new units for the new component, then the component won't be available after boot, but when the change tasks are restarted things should be fine. So whether this is critical or not depends on if we consider that having the component available early is necessary or not. And the answer is maybe yes, as in some cases we want the kernel modules to be loaded by udev rules, or maybe apps will want to load the modules when they are started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been taking a deeper look at this and done some changes. The observations I have are:
- There is not really a need to clean-up before installing here, as the mount units that are generated actually replace the files from the old component (mountpoints are the same). The switch happens when the new component is mounted to
/run/mnt/kernel-modules/...
. - But, if there is an undo, old units need to be restored. To do this, now I restore them in the undo step of
setup-kernel-modules-component
in case the component was in the system previous to trying the installation, otherwise units are removed as before. So in the end no new task is needed. - As the unit files are atomically created I expect that some component will be available if a reboot in-between happens and that the change can progress as expected after the reboot. But there is still a possible issue with calls to depmod, as I don't think it is atomic. However it is doubtful that the modinfo cache will have a change when updating a component while the kernel revision is not changing (at least for Canonical maintained kernels).
- To really do things as atomic as possible we should work in a separate kernel tree so depmod does not change the current cache and replace the current tree at the final step.
Still, I have kept for the moment the handlers for cleaning-up the component, we will need them for component removal anyway.
Related changes for the base in canonical/core-initrd#229 |
mounted kernel snap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pedronis thanks for the review, I've addressed your comments now
overlord/snapstate/component.go
Outdated
undoSetupKernMod := st.NewTask("cleanup-kernel-modules-component", | ||
fmt.Sprintf(i18n.G("Clean-up previous kernel-modules component"), | ||
compSi.Component, revisionStr)) | ||
addTask(undoSetupKernMod) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been taking a deeper look at this and done some changes. The observations I have are:
- There is not really a need to clean-up before installing here, as the mount units that are generated actually replace the files from the old component (mountpoints are the same). The switch happens when the new component is mounted to
/run/mnt/kernel-modules/...
. - But, if there is an undo, old units need to be restored. To do this, now I restore them in the undo step of
setup-kernel-modules-component
in case the component was in the system previous to trying the installation, otherwise units are removed as before. So in the end no new task is needed. - As the unit files are atomically created I expect that some component will be available if a reboot in-between happens and that the change can progress as expected after the reboot. But there is still a possible issue with calls to depmod, as I don't think it is atomic. However it is doubtful that the modinfo cache will have a change when updating a component while the kernel revision is not changing (at least for Canonical maintained kernels).
- To really do things as atomic as possible we should work in a separate kernel tree so depmod does not change the current cache and replace the current tree at the final step.
Still, I have kept for the moment the handlers for cleaning-up the component, we will need them for component removal anyway.
overlord/snapstate/backend_test.go
Outdated
func (f *fakeSnappyBackend) SetupKernelModulesComponent(cpi snap.ContainerPlaceInfo, cref naming.ComponentRef, kernelVersion string, meter progress.Meter) (err error) { | ||
meter.Notify("setup-kernel-modules-component") | ||
f.appendOp(&fakeOp{ | ||
op: "setup-kernel-modules-component", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I've fixed that now.
@@ -396,3 +398,185 @@ func (m *SnapManager) undoUnlinkCurrentComponent(t *state.Task, _ *tomb.Tomb) (e | |||
|
|||
return nil | |||
} | |||
|
|||
func kernelVersionFromPlaceInfo(spi snap.PlaceInfo) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved now
Codecov ReportAttention:
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## master #13512 +/- ##
==========================================
- Coverage 79.00% 78.88% -0.12%
==========================================
Files 1030 1032 +2
Lines 130360 131050 +690
==========================================
+ Hits 102986 103375 +389
- Misses 20968 21237 +269
- Partials 6406 6438 +32
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the changes, did another pass
kernel/kernel.go
Outdated
return "", fmt.Errorf("internal error: %w", err) | ||
} | ||
if len(matches) != 1 { | ||
return "", fmt.Errorf("number of matches for %s is %d", matchPath, len(matches)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe:
unexpected number of matches (%d) for kernel %s ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed
// UndoSetupKernelModulesComponent undoes the work of SetupKernelModulesComponent | ||
func (b Backend) UndoSetupKernelModulesComponent(cpi snap.ContainerPlaceInfo, cref naming.ComponentRef, kernelVersion string, meter progress.Meter) error { | ||
hasModules, hasFirmware := checkKernelModulesCompContent(cpi.MountDir(), kernelVersion) | ||
var partsToClean kernelModulesCleanupParts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably good to have a comment about why rerunDepmod is not set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a comment a bit below related to depmod - not 100% sure if that is what you meant though.
var cleanOnFailure kernelModulesCleanupParts | ||
defer func() { | ||
if err == nil { | ||
return | ||
} | ||
if err := cleanupKernelModulesSetup(&cleanOnFailure, kernelVersion, meter); err != nil { | ||
logger.Noticef("while cleaning up a failed kernel-modules set-up: %v", err) | ||
} | ||
}() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we keep doing this here? or should we make it a problem of the caller, which might decide to do a switch back instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point as indeed in some cases we want to restore the old state and we might not want to do previously a clean-up.
|
||
var runDepmod = runDepmodImpl | ||
|
||
func runDepmodImpl(baseDir, kernelVersion string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we have at least some tests using testutil.MockCmd to test this?
} | ||
|
||
func componentMountPoint(componentName, kernelVersion string) string { | ||
return filepath.Join("/run/mnt/kernel-modules/", kernelVersion, componentName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still wonder if it would be beneficial or not to include the revision in this one
What: filepath.Join(componentMount, "modules", kernelVersion), | ||
Where: modulesDir, | ||
Fstype: "none", | ||
Options: []string{"bind"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we record somewhere a discussion of why using a bind mount and not a symlink for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
if err := os.RemoveAll(mountDir); err != nil { | ||
return err | ||
} | ||
|
||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: could be shortened if you wanted to
if err := os.RemoveAll(mountDir); err != nil { | |
return err | |
} | |
return nil | |
return os.RemoveAll(mountDir) |
// snap (this would be the output if "uname -r" for a running kernel). | ||
func KernelVersionFromPlaceInfo(spi snap.PlaceInfo) (string, error) { | ||
systemMapPathPrefix := filepath.Join(spi.MountDir(), "System.map-") | ||
matchPath := systemMapPathPrefix + "*" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of guessig, could we read the snap.yaml and use the version declared there, assuming we can rely on the version being set and matching what would be found if we chopped the file names into pieces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately uname -r
is not matching the version in snap.yaml and even if it did we might find non-Canonical kernels not following our conventions.
|
||
// Find kernel version from matching kernel | ||
spi := snap.MinimalPlaceInfo(snapsup.InstanceName(), snapsup.Revision()) | ||
kernelVersion, err := kernel.KernelVersionFromPlaceInfo(spi) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could do kernel.ReadInfo(spi.MountDir())
here
return nil | ||
} | ||
|
||
func checkKernelModulesCompContent(mountDir, kernelVersion string) (bool, bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there isn't much in the way of checking, sounds more like discoverKernelModulesComContent
?
|
||
func runDepmodImpl(baseDir, kernelVersion string) error { | ||
logger.Debugf("running depmod on %q for kernel %s", baseDir, kernelVersion) | ||
stdout, stderr, err := osutil.RunSplitOutput("depmod", "-b", baseDir, kernelVersion) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw. do you need to pass -e
to get more useful error information maybe?
Thanks for the reviews - I am moving this to draft as we are rethinking how to do parts of this. |
Implementation of specific tasks and handlers for kernel-modules components, that need additional mount units and rebuilding modinfo files.
A spread test for this will depend on core24 changes.